perm filename NEWSEQ.DOC[COM,LSP] blob sn#635646 filedate 1982-01-20 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00003 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002
C00024 00003	Random thought on NEWSEQ:
C00045 ENDMK
CāŠ—;

                          PROPOSAL FOR KEYWORD-STYLE
                      SEQUENCE FUNCTIONS FOR COMMON LISP

                               Scott E. Fahlman

                          Carnegie-Mellon University

                               January 17, 1982

A problem in the version of Common Lisp proposed in the Swiss Cheese edition is
that the number of sequence and list grovelling functions is very large.  Prior
to  the November meeting I proposed that the number of these functions could be
greatly reduced through the judicious use of keywords to  select  some  of  the
options that otherwise would tend to multiply the number of built-in functions.
In this paper I will attempt to fill in the details of such a scheme.

First, let us assume that some scheme is adopted whereby keywords (symbols with
a  colon  in fromt of them) always evaluate to themselves.  Thus, one can write
:FOO rather than ':FOO.  This is not a necessary part of this  scheme,  but  it
does make functions containing keywords look a lot cleaner.

Second,  I  have  come around to the view that the type-specific forms of these
functions should not be part of the language as it is seen by  the  user;  only
the  generic forms should be supplied.  If the user wants the greatest possible
execution speed or better error checking, he  can  declare  the  types  of  the
sequence  arguments, either by storing these arguments in variables declared to
be of a specific type or by the use of the new THE construct (whatever name  we
end  up giving it).  The compiler is then free make whatever use it can of this
additional information in order to generate  better  code.    Thus,  one  would
normally write

    (length foo)

where FOO can be any type of sequence, but could also use the form

    (length (the :string foo))

for faster execution when the code is compiled.

[rpg: But what about interpreted code? Some people need to run interpreted
code to debug, so do you propose that the interpreter understand THE? This
won't buy any speed. I think we want the type specific ones just for this
reason. Remember, *you* don't have the S-1.]

In these generic functions, symbols are not acceptable as sequences and are not
coerced  into  strings.  I see no good way to make this feature compatible with
generics of this type.

Sticking to generics helps a lot in reducing the number of functions, but there
are still too many.  Most of the functions that do some sort of search  through
the  members  of  a  sequence  come in groups of five (or sometimes only three)
functions, according to what test or comparison function is to be used.  On top
of that, many of the searches can  be  performed  either  forward  or  backward
through  the  sequence,  multiplying the number of functions by two.  These are
the options that I propose to handle with keywords rather than with  individual
functions.

The  following  functions would be supplied only in generic form and would have
no need for keywords.  They would remain as  documented  in  the  Swiss  Cheese
edition, except that the type-specific forms would be eliminated:  ELT, SETELT,
SUBSEQ,   COPYSEQ,  LENGTH,  FILL,  REVERSE,  NREVERSE,  SOME,  EVERY,  NOTANY,
NOTEVERY, SORT, STABLE-SORT, and MERGE.  TO-LIST, TO-VECTOR, and friends  would
also be provided in generic form only and would take no keywords.

REPLACE  would  probably  also  remain in its current form, though there are so
many optional args that we might want to turn  this  into  a  keyword  function
along the lines described below.  (Opinions?)

The MAP function would take a function argument, then a type-specifier argument
indicating  the  type  of  output  sequence  that  is to be returned.  (If this
argument is (), return no sequence, just discard the  results.)    Next,  there
would  be  any number of sequence arguments, which may be a mixture of sequence
types.  By requiring that the output type be explicitly specified, we eliminate
the ugly problem of finding a nice coercion  hierarchy.    Map  would  take  no
keywords.

The  CONCAT  function  would  take any number of sequence arguments of the same
type and would return a new sequence of that type.  If mixed sequences  are  to
be  concatenated,  they  must  coerced  to  the  proper type with the TO-mumble
functions.  The compiler can be smart about eliminating  unnecessary  steps  in
this situation.

Idea: Suppose we eliminate CONCAT altogether, and generalize the TO-LIST family
of functions to take multiple arguments.  These arguments would be sequences of
any  mixture of types, and their elements would be concatenated into a sequence
of the type specified in the function name.   (To  accommodate  all  the  funny
types  of  vectors,  we  might  also  want  a  general TO function that takes a
required  type-specifier  argument  before  the  sequences.)    With  a  single
argument,  these  would  do  a simple coercion, as before; if there is only one
argument and it is already of the right type,  it  is  returned  without  being
copied.    The  point is that the type of the output sequence is specified with
this scheme, and the system does not have to  try  to  figure  out  "the  least
general sequence type among those the implementation provides which can contain
the elements of all the argument sequences", whatever that means.  Opinions?

[rpg: This overloads TO-mumble too much, providing a functionality without
a corresponding mnemonic name.]

Now  for  the  functions that require keywords.  The general idea is to replace
each group of three or five related functions with a single function that would
take a couple of required arguments  followed  by  optional  keywords.    These
functions  would  use  the  good  names  (e.g.   "REMOVE" instead of "REM") and
without any keywords would default to forward direction and EQUAL test.   Thus,
they  upward  compatible  with existing functions in Maclisp.  (Were it not for
this compatibility issue, EQL would have been a better default,  but  a  change
here would cause more trouble than it is worth.)

Under  this  scheme  there will be no need for the old EQ forms since REMQ, for
example, is just REMOVE with the :COMPARE #'EQ option, but we  should  probably
include  the  common ones for convenience and compatibility.  REMQ, MEMQ, DELQ,
and ASSQ are probably worth the redundancy.

The following keywords will be  used  in  a  consistent  way  through  all  the
keyword-taking functions.  Not all of the keywords are applicable to all of the
functions  --  see  the  individual  function  descriptions for a list of which
keywords are applicable.

:from-end       This keyword takes no arguments.  It turns  any  function  into
                the corresponding "from-end" version.  If it is not present, of
                course, the direction defaults to forward.

The following four keywords are mutually exclusive:

:compare predicate 
                The  predicate  is the two-argument comparison function that is
                to be used in testing items against members  of  the  sequence.
                The default predicate is EQUAL.

:compare-not predicate 
                Like  :compare,  but  the result of the comparison predicate is
                negated before it is used.

:if predicate   The predicate is a one-argument function that is used  to  test
                members of the sequence.

:if-not predicate 
                Like  :if, but the result of the predicate is negated before it
                is used.

:start index    If present, the  examination  or  processing  of  the  sequence
                begins here.  Else, processing begins at index 0.  If there are
                two  sequence  arguments  and  the  :start  keyword is used, it
                provides a starting index for both sequences.

:end index      If present, the examination or processing of the sequence  ends
                at  this  index  (not inclusive).  Else, processing proceeds to
                the end of the sequence.  If there are two  sequence  arguments
                and  the  :end keyword is used, it provides an ending index for
                both sequences.

If there are two sequence arguments and they are to be provided with  different
                indices,  use  :start1,  :start2, :end1, and :end2.  All take a
                single index argument.

A particularly ugly case would look like this:

    (position foo-element (the :vector foo-vector)
              :compare-not #'eql :start 10 :end 20)

A more typical case would look like this:

    (position foo-element foo-vector :compare #'eql)

The following sequence functions use the above keywords:

REMOVE item sequence &optional count <keywords> 
                Accepts :from-end, :compare, :compare-not, :if, :if-not.

POSITION item sequence <keywords> 
                Accepts  :from-end,  :compare,  :compare-not,   :if,   :if-not,
                :start, :end.

COUNT item sequence <keywords> 
                Accepts   :from-end,   :compare,  :compare-not,  :if,  :if-not,
                :start, :end.

MISMATCH sequence1, sequence2 <keywords> 
                Accepts  :from-end,  :compare,  :compare-not,   :start,   :end,
                :start1, :start2, :end1, end2.

MAXPREFIX sequence1, sequence2 <keywords> 
                Accepts   :from-end,   :compare,  :compare-not,  :start,  :end,
                :start1, :start2, :end1, end2.

MAXSUFFIX sequence1, sequence2 <keywords> 
                Accepts  :from-end,  :compare,  :compare-not,   :start,   :end,
                :start1, :start2, :end1, end2.

SEARCH sequence1, sequence2 <keywords> 
                Accepts   :from-end,   :compare,  :compare-not,  :start,  :end,
                :start1, :start2, :end1, end2.

In addition to the sequence functions listed  above,  the  following  list-only
functions  can use the same keyword scheme to get rid of the groups of three or
five functions.

MEMBER item list <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

DELETE item list &optional count <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

ADJOIN item list <keywords> 
                Accepts :compare, :compare-not.

UNION &rest lists <keywords> 
                Accepts :compare, :compare-not.

NUNION &rest lists <keywords> 
                Accepts :compare, :compare-not.

INTERSECTION &rest lists <keywords> 
                Accepts :compare, :compare-not.

NINTERSECTION &rest lists <keywords> 
                Accepts :compare, :compare-not.

SET-DIFFERENCE list1 list2 <keywords> 
                Accepts :compare, :compare-not.

NSET-DIFFERENCE list1 list2 <keywords> 
                Accepts :compare, :compare-not.

SET-XOR list1 list2 <keywords> 
                Accepts :compare, :compare-not.

NSET-XOR list1 list2 <keywords> 
                Accepts :compare, :compare-not.

ASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

RASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

MEMASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

MEMRASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

POSASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

POSRASSOC item alist <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

DELASSOC item alist &optional count <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

DELRASSOC item alist &optional count <keywords> 
                Accepts :compare, :compare-not, :if, :if-not.

Idea:  Add  the  keyword  :eq, which is short for :compare #'eq.  This could be
used in any form that accepts the :compare keyword.  Similarly, add :neq, :eql,
:neql, :equal, :nequal, :equalp, and :nequalp.  This would eliminate unpleasant
verbosity in the vast majority of cases, though the added keywords  might  slow
things down a bit in interpreted code.  Opinions?

[rpg: This argument adds weight to the functional position macro idea, which is
to put all the keywords in a macro in the lambda position, which then expands
to some awful internal name that can that displacedly stick around.]


One  awkward  issue  is  raised by the use of :if and :if-not.  Since these are
one-place predicates, the item argument is not used.   It  would  beautify  the
code  to  eliminate  the item argument if the :if keyword is present.  Thus, we
would have

    (member some-list :if #'ugly-element-p)

instead of

    (member ignored-arg some-list :if #'ugly-element-p)

Unfortunately,  it  is   mechanically   very   awkward   to   squeeze   out   a
normally-required   argument   that  precedes  other  required  args,  and  the
documentation might prove quite confusing to the user.  The alternatives  would
seem  to  be  keeping  the ignored argument (the user would put an explicit NIL
there) or flushing the :if construct altogether.  I  lean  toward  the  latter.
Comments?  Ideas?

[rpg: The macro idea fixes this too, I think. The functional macro idea is
also nice because the keywords describe the function to use in the
function call more than the function call. 
Having te keyword more closely associated with the object they modify seems
like a good thing. This proposal mixes up arguments to a function with
a description of the function. The above example would be:
    ((member :if #'ugly-element-p) some-list)]
Random thought on NEWSEQ:
	. . .

Second,  I  have  come around to the view that the type-specific forms of these
functions should not be part of the language as it is seen by  the  user;  only
the  generic forms should be supplied.  If the user wants the greatest possible
execution speed or better error checking, he  can  declare  the  types  of  the
sequence  arguments, either by storing these arguments in variables declared to
be of a specific type or by the use of the new THE construct (whatever name  we
end  up giving it).  The compiler is then free make whatever use it can of this
additional information in order to generate  better  code.    Thus,  one  would
normally write

    (length foo)

where FOO can be any type of sequence, but could also use the form

    (length (the :string foo))

for faster execution when the code is compiled.

[rpg: But what about interpreted code? Some people need to run interpreted
code to debug, so do you propose that the interpreter understand THE? This
won't buy any speed. I think we want the type specific ones just for this
reason. Remember, *you* don't have the S-1.]

		. . . 

Idea: Suppose we eliminate CONCAT altogether, and generalize the TO-LIST family
of functions to take multiple arguments.  These arguments would be sequences of
any  mixture of types, and their elements would be concatenated into a sequence
of the type specified in the function name.   (To  accommodate  all  the  funny
types  of  vectors,  we  might  also  want  a  general TO function that takes a
required  type-specifier  argument  before  the  sequences.)    With  a  single
argument,  these  would  do  a simple coercion, as before; if there is only one
argument and it is already of the right type,  it  is  returned  without  being
copied.    The  point is that the type of the output sequence is specified with
this scheme, and the system does not have to  try  to  figure  out  "the  least
general sequence type among those the implementation provides which can contain
the elements of all the argument sequences", whatever that means.  Opinions?

[rpg: This overloads TO-mumble too much, providing a functionality without
a corresponding mnemonic name.]

Idea:  Add  the  keyword  :eq, which is short for :compare #'eq.  This could be
used in any form that accepts the :compare keyword.  Similarly, add :neq, :eql,
:neql, :equal, :nequal, :equalp, and :nequalp.  This would eliminate unpleasant
verbosity in the vast majority of cases, though the added keywords  might  slow
things down a bit in interpreted code.  Opinions?

[rpg: This argument adds weight to the functional position macro idea, which is
to put all the keywords in a macro in the lambda position, which then expands
to some awful internal name that can that displacedly stick around.]

		. . .

One  awkward  issue  is  raised by the use of :if and :if-not.  Since these are
one-place predicates, the item argument is not used.   It  would  beautify  the
code  to  eliminate  the item argument if the :if keyword is present.  Thus, we
would have

    (member some-list :if #'ugly-element-p)

instead of

    (member ignored-arg some-list :if #'ugly-element-p)

Unfortunately,  it  is   mechanically   very   awkward   to   squeeze   out   a
normally-required   argument   that  precedes  other  required  args,  and  the
documentation might prove quite confusing to the user.  The alternatives  would
seem  to  be  keeping  the ignored argument (the user would put an explicit NIL
there) or flushing the :if construct altogether.  I  lean  toward  the  latter.
Comments?  Ideas?

[rpg: The macro idea fixes this too, I think. The functional macro idea is
also nice because the keywords describe the function to use in the
function call more than the function call. 
Having te keyword more closely associated with the object they modify seems
like a good thing. This proposal mixes up arguments to a function with
a description of the function. The above example would be:
    ((member :if #'ugly-element-p) some-list)]
			-rpg-